282 research outputs found

    Exploring subdomain variation in biomedical language.

    BACKGROUND: Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the biomedical domain, i.e., the extent to which different subject areas of biomedicine are characterised by different linguistic behaviour. While variation at a coarser domain level, such as between newswire and biomedical text, is well-studied and known to affect the portability of NLP systems, we are the first to conduct an extensive investigation into more fine-grained levels of variation. RESULTS: Using the large OpenPMC text corpus, which spans the many subdomains of biomedicine, we investigate variation across a number of lexical, syntactic, semantic and discourse-related dimensions. These dimensions are chosen for their relevance to the performance of NLP systems. We use clustering techniques to analyse commonalities and distinctions among the subdomains. CONCLUSIONS: We find that while patterns of inter-subdomain variation differ somewhat from one feature set to another, robust clusters can be identified that correspond to intuitive distinctions such as that between clinical and laboratory subjects. In particular, subdomains relating to genetics and molecular biology, which are the most common sources of material for training and evaluating biomedical NLP tools, are not representative of all biomedical subdomains. We conclude that an awareness of subdomain variation is important when considering the practical use of language processing applications by biomedical researchers.

    Unsupervised Morphology-Based Vocabulary Expansion

    We present a novel way of generating unseen words, which is useful for certain applications such as automatic speech recognition or optical character recognition in low-resource languages. We test our vocabulary generator on seven low-resource languages by measuring the decrease in out-of-vocabulary word rate on a held-out test set. The languages we study have very different morphological properties; we show how our results differ depending on the morphological complexity of the language. In our best result (on Assamese), our approach can predict 29% of the token-based out-of-vocabulary words with a small amount of unlabeled training data.

    Scintillation efficiency measurement of Na recoils in NaI(Tl) below the DAMA/LIBRA energy threshold

    The dark matter interpretation of the DAMA modulation signal depends on the NaI(Tl) scintillation efficiency of nuclear recoils. Previous measurements for Na recoils have large discrepancies, especially in the DAMA/LIBRA modulation energy region. We report a quenching effect measurement of Na recoils in NaI(Tl) from 3 keV$_{\text{nr}}$ to 52 keV$_{\text{nr}}$, covering the whole DAMA/LIBRA energy region for light WIMP interpretations. By using a low-energy, pulsed neutron beam, a double time-of-flight technique, and pulse-shape discrimination methods, we obtained the most accurate measurement of this kind for NaI(Tl) to date. The results differ significantly from the DAMA reported values at low energies, but fall between the other previous measurements. We present the implications of the new quenching results for the dark matter interpretation of the DAMA modulation signal.

    Autophagy: A Forty-Year Search for a Missing Membrane Source

    Autophagy is central to diverse biological processes in eukaryotes, including animal development and cellular survival, and also to neurodegenerative diseases, but the origin of the membranes that make up autophagic vesicles is unknown.

    Snowmass 2021 Cross Frontier Report: Dark Matter Complementarity (Extended Version)

    The fundamental nature of Dark Matter is a central theme of the Snowmass 2021 process, extending across all frontiers. In the last decade, advances in detector technology, analysis techniques and theoretical modeling have enabled a new generation of experiments and searches while broadening the types of candidates we can pursue. Over the next decade, there is great potential for discoveries that would transform our understanding of dark matter. In the following, we outline a road map for discovery developed in collaboration among the frontiers. A strong portfolio of experiments that delves deep, searches wide, and harnesses the complementarity between techniques is key to tackling this complicated problem, requiring expertise, results, and planning from all Frontiers of the Snowmass 2021 process. Comment: v1 is first draft for community comment

    On the abundance of non-cometary HCN on Jupiter

    Using one-dimensional thermochemical/photochemical kinetics and transport models, we examine the chemistry of nitrogen-bearing species in the Jovian troposphere in an attempt to explain the low observational upper limit for HCN. We track the dominant mechanisms for interconversion of N2-NH3 and HCN-NH3 in the deep, high-temperature troposphere and predict the rate-limiting step for the quenching of HCN at cooler tropospheric altitudes. Consistent with other investigations that were based solely on time-scale arguments, our models suggest that transport-induced quenching of thermochemically derived HCN leads to very small predicted mole fractions of hydrogen cyanide in Jupiter's upper troposphere. By the same token, photochemical production of HCN is ineffective in Jupiter's troposphere: CH4-NH3 coupling is inhibited by the physical separation of the CH4 photolysis region in the upper stratosphere from the NH3 photolysis and condensation region in the troposphere, and C2H2-NH3 coupling is inhibited by the low tropospheric abundance of C2H2. The upper limits from infrared and submillimeter observations can be used to place constraints on the production of HCN and other species from lightning and thundershock sources. Comment: 56 pages, 0 tables, 6 figures. Submitted to Faraday Discussions [in press]

    The use of technology in group-work: a Situational Analysis of students’ reflective writing

    Group work is a powerful constructivist pedagogy for facilitating students’ personal and professional development, but it can be difficult for students to work together in an academic context. The assessed reflective writings of undergraduate students studying Information Management are used as data in this exploration of the group work situation and what matters to students in terms of ensuring success. Situational Analysis provides the methodological framework and a number of mapping techniques are used to interrogate the data. Students reflect on the importance of communication for group work and identify the convivial tools they use when arranging meetings, working collaboratively and producing outputs. Students valued the instant communication facilitated by smart phones, but despite the immediacy of electronic communication, face-to-face meetings are still highly valued. Silences in the data reveal the lack of engagement with the Virtual Learning Environment as a tool for group collaboration. Implications for educators in supporting group work are identified.

    Identification of Radiopure Titanium for the LZ Dark Matter Experiment and Future Rare Event Searches

    The LUX-ZEPLIN (LZ) experiment will search for dark matter particle interactions with a detector containing a total of 10 tonnes of liquid xenon within a double-vessel cryostat. The large mass and proximity of the cryostat to the active detector volume demand the use of material with extremely low intrinsic radioactivity. We report on the radioassay campaign conducted to identify suitable metals, the determination of factors limiting radiopure production, and the selection of titanium for construction of the LZ cryostat and other detector components. This titanium has been measured with activities of $^{238}$U$_{e}$ $<$ 1.6 mBq/kg, $^{238}$U$_{l}$ $<$ 0.09 mBq/kg, $^{232}$Th$_{e}$ $= 0.28 \pm 0.03$ mBq/kg, $^{232}$Th$_{l}$ $= 0.25 \pm 0.02$ mBq/kg, $^{40}$K $<$ 0.54 mBq/kg, and $^{60}$Co $<$ 0.02 mBq/kg (68% CL). Such low intrinsic activities, which are some of the lowest ever reported for titanium, enable its use for future dark matter and other rare event searches. Monte Carlo simulations have been performed to assess the expected background contribution from the LZ cryostat with this radioactivity. In 1,000 days of WIMP search exposure of a 5.6-tonne fiducial mass, the cryostat will contribute only a mean background of $0.160 \pm 0.001$ (stat) $\pm 0.030$ (sys) counts. Comment: 13 pages, 3 figures, accepted for publication in Astroparticle Physics